Figure 1a 

Input U and D (resampling ratio U/D).

Design a windowed sinc filter of length 4*max(U,D) samples;
compute the U output phase coefficients (one data access block).

Pick a multiply factor if U is small.

For the first/next architecture kernel, pick a first estimate of the data
step per group as the integer value of (kernel height)*D/U.

For the first/next estimate of the data step per group around the first
estimate, compute the starting point and the number of taps.

If the number of taps is less than the prior best (initially the
maximum number), retain it as the best.

Output: best architecture kernel, data step per
group, and sub-filter length.
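
The first two boxes of the flowchart (filter design and the split into U output phases) can be sketched as follows. This is only an illustration under stated assumptions: the Hamming window, the cutoff of 1/max(U, D), the gain normalisation, and the names `windowed_sinc` and `split_phases` are not given in the source.

```python
import math

def windowed_sinc(U, D):
    """Prototype low-pass filter of length 4*max(U, D), as in the flowchart.

    Assumptions of this sketch: Hamming window, cutoff at 1/max(U, D),
    coefficients scaled so that they sum to U.
    """
    m = max(U, D)
    length = 4 * m
    center = (length - 1) / 2.0
    h = []
    for n in range(length):
        t = (n - center) / m  # offset from the filter centre, scaled by max(U, D)
        sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
        h.append(sinc * window)
    scale = U / sum(h)  # make the coefficients sum to U
    return [c * scale for c in h]

def split_phases(h, U):
    """Phase p holds h[p], h[p+U], h[p+2U], ... (one sub-filter per output phase)."""
    return [h[p::U] for p in range(U)]

h = windowed_sinc(4, 3)
phases = split_phases(h, 4)
print(len(h), [len(p) for p in phases])
```

For U = 4 and D = 3 this gives a 16-tap prototype split into four 4-tap sub-filters.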
Figure 1b (DSC components)

Figure 2a

Figure 2c

For a polyphase filtering implementation of this upsampling, the data access pattern is shown below.

Architecture Kernel / Explanation / Example
single point

O

Explanation: Single-thread filtering, fetching one data point at a
time. The next access can stay on the same point
or move by a fixed distance. This is the most
flexible pattern. However, there is usually a time cost
associated with each level of looping, and looping is
needed to change the stepping distance
in a regular manner.

A parallel DSP performing vertical filtering often
uses this single-point kernel, as its parallelism is
applied to different filtering problems (for vertical
filtering, each column is operated on independently
and is thus a separate problem).

Example: 5-tap-per-output filtering over 7 outputs,
which is one data access block
for 7/D resampling.
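
A data access block of this kind can be tabulated with a few lines of Python. The sketch below uses the usual polyphase convention, where output n starts reading at input index floor(n*D/U) and uses phase (n*D) mod U. That convention, the function name `access_pattern`, and the value D = 3 are assumptions for illustration; the source does not state D.

```python
def access_pattern(U, D, taps):
    """For each of the U outputs in one data access block, return
    (first input index, coefficient phase, number of taps)."""
    pattern = []
    for n in range(U):
        pos = n * D  # output position measured on the upsampled grid
        pattern.append((pos // U, pos % U, taps))
    return pattern

# 7/D resampling with 5 taps per output; D = 3 is a hypothetical choice.
for start, phase, taps in access_pattern(7, 3, 5):
    print(f"start={start} phase={phase} taps={taps}")
```

With a single-point kernel, each row of this table becomes one inner loop of `taps` accesses, and the change in `start` between rows is the stepping distance.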


4-wide

OOOO

Explanation: DSP with fixed, 4-wide data access and the capability
to compute an inner product with a coefficient array,
producing a single sum.

To be efficient for filtering, the starting point should
be allowed on any alignment.

Example: 4-tap-per-output filtering for a 6/D
resampling.


5 oooo 


Explanation: DSP with 4 parallel execution units, all
fed with the same single data point.

There is a significant distinction in DSP architecture
that affects how we can use this and many other
architecture kernels: writing on any alignment, or
writing only on 2^N-word alignment. The former
can be used to implement any U factor. The latter
only works when U is a multiple of 2^N. When
U is not a multiple of 4, for example, we upscale U
and D so that they are, at the expense of efficiency.

Example: A write-any-alignment architecture
implementing a 7/D resampling with
3-tap-per-output filters. Note that 8 outputs
are computed and then one is thrown
away.
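
The upscaling of U and D for architectures that write only on 2^N-word alignment can be sketched as below. The choice of the smallest common factor that makes U a multiple of 2^N is an assumption of this sketch, and `upscale_for_alignment` is a hypothetical name; the source only says that U and D are upscaled.

```python
import math

def upscale_for_alignment(U, D, n_bits):
    """Scale U and D by the same factor so that U becomes a multiple of 2**n_bits.

    The ratio U/D is unchanged. The smallest such factor is used here
    (an assumption; the source does not say how the factor is chosen).
    """
    align = 1 << n_bits
    factor = align // math.gcd(U, align)
    return U * factor, D * factor, factor

print(upscale_for_alignment(7, 5, 2))
print(upscale_for_alignment(6, 5, 2))
```

The cost in efficiency comes from the longer data access block: a 7/5 ratio handled as 28/20 computes 28 output phases per block instead of 7.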


1:1 slope, 4 outputs

Explanation: DSP with 4 parallel execution units,
fed with 4 data points, one for each unit.

To be efficient for filtering, the 4 inputs need to be
accessible on any word alignment.

Example: A 3/D resampling with
4-tap-per-output filters. Note that 4 outputs are
computed and then one is thrown away.


1:2 slope, 2 outputs

oo
oo

Explanation: DSP that can take in 4 inputs and perform
acc_A = acc_A + c0*d0 + c1*d1,
acc_B = acc_B + c2*d2 + c3*d3

To be efficient for filtering, the inputs need to be
accessible on any word alignment.

Example: A 4/D resampling with
4-tap-per-output filters.
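
The dual multiply-accumulate operation of this kernel can be modelled directly. In the sketch below, `dual_mac` follows the two equations given in the table; `two_outputs_4tap` is one possible way (an assumption, not taken from the source) to build two 4-tap outputs from two such operations, two taps per output per step.

```python
def dual_mac(acc_a, acc_b, c, d):
    """acc_A += c0*d0 + c1*d1; acc_B += c2*d2 + c3*d3."""
    acc_a = acc_a + c[0] * d[0] + c[1] * d[1]
    acc_b = acc_b + c[2] * d[2] + c[3] * d[3]
    return acc_a, acc_b

def two_outputs_4tap(x, start_a, start_b, coef_a, coef_b):
    """Hypothetical usage: two 4-tap outputs, each accumulated over two steps."""
    acc_a = acc_b = 0
    for k in (0, 2):
        c = [coef_a[k], coef_a[k + 1], coef_b[k], coef_b[k + 1]]
        d = [x[start_a + k], x[start_a + k + 1], x[start_b + k], x[start_b + k + 1]]
        acc_a, acc_b = dual_mac(acc_a, acc_b, c, d)
    return acc_a, acc_b

print(dual_mac(0, 0, [1, 2, 3, 4], [5, 6, 7, 8]))
print(two_outputs_4tap(list(range(10)), 0, 1, [1, 1, 1, 1], [1, 1, 1, 1]))
```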
1:4 slope, 2 outputs
1:2 slope, 4 outputs
1:1 slope, 8 outputs
2:1 slope, 8 outputs
4:1 slope, 8 outputs
4 tall
2:1 slope, 4 outputs
1:1 slope, 4 outputs

Horizontal resampling

Vertical resampling
H0 = H{4, 8, 12}
H1 = H{1, 5, 9, 13}
H2 = H{2, 6, 10, 14}
H3 = H{3, 7, 11, 15}
H0 = H{4, 8, 12}
and so on
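
The sub-filter sets listed above are consistent with grouping the taps of H by index modulo 4. The sketch below reproduces them, assuming U = 4 and taps numbered 1 to 15 (both inferred from the listing, not stated in the source); `phase_tap_indices` is a hypothetical name.

```python
def phase_tap_indices(first, last, U):
    """Group tap indices first..last by (index mod U), one group per output phase."""
    groups = {p: [] for p in range(U)}
    for k in range(first, last + 1):
        groups[k % U].append(k)
    return groups

for phase, taps in phase_tap_indices(1, 15, 4).items():
    print(f"H{phase} = H{{{', '.join(map(str, taps))}}}")
```

Successive outputs then cycle through H0, H1, H2, H3 and back to H0, which matches the repeated H0 line at the end of the listing.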